feat: data batch processing of number assessment events#2292
Conversation
|
Warning Rate limit exceeded@jo-elimu has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 15 minutes and 58 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
## Walkthrough
A new scheduled import task for number assessment events from CSV files was introduced, including a new JPA relationship between NumberAssessmentEvent and Number. Several other event import schedulers had their execution times updated via cron expressions and comments. A new CSV extraction method and corresponding test were added to support number assessment event data parsing.
## Changes
| File(s) | Change Summary |
|------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|
| src/main/java/ai/elimu/entity/analytics/NumberAssessmentEvent.java | Added a `@ManyToOne` relationship field `number` to the entity. |
| src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java | Introduced a scheduled service to import number assessment events from CSV files, with deduplication and Discord notifications. |
| src/main/java/ai/elimu/tasks/analytics/WordAssessmentEventImportScheduler.java<br>src/main/java/ai/elimu/tasks/analytics/WordLearningEventImportScheduler.java<br>src/main/java/ai/elimu/tasks/analytics/VideoLearningEventImportScheduler.java<br>src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java | Updated cron expressions and comments to correct scheduled execution times for existing import tasks. |
| src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java | Added method to extract NumberAssessmentEvent entities from CSV files. |
| src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java | Added test method verifying extraction of NumberAssessmentEvent entities from CSV files. |
| src/main/java/ai/elimu/web/analytics/MainAnalyticsController.java | Added NumberAssessmentEventDao dependency and included number assessment event count in analytics model. |
| src/main/java/ai/elimu/web/servlet/CustomDispatcherServlet.java | Added generation and persistence of dummy NumberAssessmentEvent data during DB population for development. |
| src/main/resources/META-INF/jpa-schema-export.sql | Added `number_id` column and foreign key constraint to `NumberAssessmentEvent` table referencing `Number` table. |
| src/main/java/ai/elimu/web/analytics/students/StudentController.java | Added NumberAssessmentEventDao dependency and weekly aggregation of number assessment events for student analytics.|
| src/main/webapp/WEB-INF/jsp/analytics/students/id.jsp | Added UI section for number assessment events with export button, chart, and recent events table. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Scheduler as NumberAssessmentEventImportScheduler
participant FileSystem as Local File System
participant CsvHelper as CsvAnalyticsExtractionHelper
participant DB as Database
participant Discord as DiscordService
Scheduler->>FileSystem: Scan for CSV files in import directory
loop For each CSV file
Scheduler->>CsvHelper: Extract NumberAssessmentEvent entities
loop For each event
Scheduler->>DB: Check for existing event by timestamp, Android ID, package name
alt Event does not exist
Scheduler->>DB: Ensure Student exists (create if needed)
Scheduler->>DB: Resolve Number entity (if ID present)
Scheduler->>DB: Persist NumberAssessmentEvent
end
end
Scheduler->>Discord: Post summary message with analytics link
end
alt Exception occurs
Scheduler->>Discord: Notify error
endPossibly related PRs
Suggested reviewers
|
There was a problem hiding this comment.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java (1)
65-66: Guard againstnullfromlistFiles()
File#listFiles()returnsnullon I/O errors or when the path is not a directory.
Dereferencing without a check risks an NPE that would halt the whole import cycle.- for (File analyticsDirFile : analyticsDir.listFiles()) { + File[] androidDirs = analyticsDir.listFiles(); + if (androidDirs == null) { + log.warn("Could not list analytics sub-directories in {}", analyticsDir); + return; + } + for (File analyticsDirFile : androidDirs) {
🧹 Nitpick comments (3)
src/main/java/ai/elimu/tasks/analytics/VideoLearningEventImportScheduler.java (1)
54-54: Externalise the cron expression for easier ops tuningHard-coding the cron schedule makes future rescheduling require a full redeploy. Consider using a property placeholder so ops can tweak timing without code changes:
- @Scheduled(cron = "00 50 * * * *") // 50 minutes past every hour + // 50 minutes past every hour (default). Override via application properties if needed. + @Scheduled(cron = "${analytics.video-learning-event-import.cron:0 50 * * * *}")This follows the same pattern several of our other schedulers already use and keeps runtime configuration consistent across environments.
src/main/java/ai/elimu/tasks/analytics/WordAssessmentEventImportScheduler.java (1)
54-55: Externalize the cron expression for easier ops tuningHard-coding the schedule requires a redeploy for even minor timing tweaks. Move the value to
application.properties(e.g.scheduler.wordAssessmentEventImport.cron=0 35 * * * *) and reference it with
@Scheduled(cron = "${scheduler.wordAssessmentEventImport.cron}").
This keeps the code clean while giving ops full control.src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java (1)
81-88: Delete or archive processed CSVs to cut repeated disk scansEven though duplicates are skipped at DB level, every execution re-parses the same CSVs, wasting I/O and CPU.
After a successful import, move the file to animported/folder (or delete) so subsequent runs only touch new data.
This also simplifies manual troubleshooting.If desired, I can draft the refactor.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/main/java/ai/elimu/entity/analytics/NumberAssessmentEvent.java(2 hunks)src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java(1 hunks)src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java(1 hunks)src/main/java/ai/elimu/tasks/analytics/VideoLearningEventImportScheduler.java(1 hunks)src/main/java/ai/elimu/tasks/analytics/WordAssessmentEventImportScheduler.java(1 hunks)src/main/java/ai/elimu/tasks/analytics/WordLearningEventImportScheduler.java(1 hunks)src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java(1 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
src/main/java/ai/elimu/tasks/analytics/WordLearningEventImportScheduler.java (1)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java (1)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (2)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
Learnt from: jo-elimu
PR: elimu-ai/webapp#2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).
src/main/java/ai/elimu/tasks/analytics/WordAssessmentEventImportScheduler.java (1)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
src/main/java/ai/elimu/tasks/analytics/VideoLearningEventImportScheduler.java (1)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
🧬 Code Graph Analysis (1)
src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java (2)
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (1)
Slf4j(35-624)src/main/java/ai/elimu/tasks/analytics/WordLearningEventImportScheduler.java (1)
Service(45-127)
⏰ Context from checks skipped due to timeout of 90000ms (12)
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: build (macos-latest, 17)
- GitHub Check: build (ubuntu-latest, 17)
- GitHub Check: test_rest
- GitHub Check: build (ubuntu-latest, 21)
- GitHub Check: build (windows-latest, 17)
- GitHub Check: build (windows-latest, 21)
- GitHub Check: build (macos-latest, 21)
🔇 Additional comments (4)
src/main/java/ai/elimu/tasks/analytics/WordAssessmentEventImportScheduler.java (1)
54-54: Cron time update acknowledgedThe new
"00 35 * * * *"expression correctly shifts the task to 35 minutes past the hour and matches the inline comment.src/main/java/ai/elimu/tasks/analytics/WordLearningEventImportScheduler.java (1)
54-54: LGTM!The scheduling change is correctly implemented with the comment accurately reflecting the new execution time.
src/main/java/ai/elimu/entity/analytics/NumberAssessmentEvent.java (1)
3-5: LGTM!The JPA relationship is correctly implemented following standard patterns. The placement after the
numberIdfield is logical and maintains consistency with the existing code structure.Also applies to: 33-34
src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java (1)
45-127: Implementation follows established patterns correctly.The scheduler implementation is well-structured and consistent with other event import schedulers in the codebase. The error handling, duplicate checking, and Discord notifications are properly implemented.
| @Scheduled(cron = "00 45 * * * *") // 45 minutes past every hour | ||
| public synchronized void execute() { |
There was a problem hiding this comment.
💡 Verification agent
❓ Verification inconclusive
Specify a timezone to avoid server-dependent trigger drift
@Scheduled defaults to the JVM default timezone, which varies between environments (local laptop, staging, prod container).
Explicitly set zone = "UTC" (or your canonical TZ) to guarantee the job always fires at exactly 45 min past the hour independent of host configuration.
- @Scheduled(cron = "00 45 * * * *") // 45 minutes past every hour
+ @Scheduled(cron = "00 45 * * * *", zone = "UTC") // 45 minutes past every UTC hour#!/bin/bash
set -e
1. Verify that @scheduled is imported from Spring
grep -R "import org.springframework.scheduling.annotation.Scheduled" -n src
2. Check for Spring Boot parent or Spring Framework version in pom.xml
grep -R "spring-boot-starter-parent" -n pom.xml -A2 || echo "No Spring Boot parent found"
grep -R "" -n pom.xml | grep -A2 "springframework" || echo "No direct Spring Framework dependency found"
#!/bin/bash
set -e
1. Confirm that @scheduled is imported in the scheduler class
grep -n "import org.springframework.scheduling.annotation.Scheduled" src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java
2. Check for Spring Boot parent version in pom.xml
grep -n "<spring-boot-starter-parent" -n pom.xml -A1 || echo "No spring-boot-starter-parent found"
Add explicit timezone to @scheduled to avoid trigger drift
Spring’s @scheduled defaults to the JVM’s time zone, which can vary across environments. To ensure this job always runs at 45 minutes past the hour in UTC, add the zone attribute.
• File: src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java (lines 54–55)
Apply this change:
- @Scheduled(cron = "00 45 * * * *") // 45 minutes past every hour
+ @Scheduled(cron = "00 45 * * * *", zone = "UTC") // 45 minutes past every UTC hour📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @Scheduled(cron = "00 45 * * * *") // 45 minutes past every hour | |
| public synchronized void execute() { | |
| @Scheduled(cron = "00 45 * * * *", zone = "UTC") // 45 minutes past every UTC hour | |
| public synchronized void execute() { |
🤖 Prompt for AI Agents
In
src/main/java/ai/elimu/tasks/analytics/StoryBookLearningEventImportScheduler.java
at lines 54-55, the @Scheduled annotation lacks an explicit timezone, which can
cause the scheduled task to trigger at different times depending on the JVM's
default timezone. Fix this by adding the zone attribute to the @Scheduled
annotation, setting it to "UTC" to ensure consistent execution at 45 minutes
past every hour regardless of environment.
| log.info("csvFile: " + csvFile); | ||
|
|
||
| // Convert from CSV to Java | ||
| List<NumberAssessmentEvent> events = CsvAnalyticsExtractionHelper.extractNumberAssessmentEvents(csvFile); |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Method extractNumberAssessmentEvents is not implemented.
The method CsvAnalyticsExtractionHelper.extractNumberAssessmentEvents called here doesn't exist in the helper class, which will cause a compilation error.
Let me verify if this method exists elsewhere in the codebase:
🏁 Script executed:
#!/bin/bash
# Search for the extractNumberAssessmentEvents method definition
ast-grep --pattern 'extractNumberAssessmentEvents($$$) {
$$$
}'
# Also search with ripgrep in case the pattern doesn't match
rg -A 5 "extractNumberAssessmentEvents"Length of output: 955
Implement the missing extractNumberAssessmentEvents method
The call at
src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java:77
will fail to compile becauseCsvAnalyticsExtractionHelper.extractNumberAssessmentEventsdoes not exist.
Action items:
- Add a new method in
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:public static List<NumberAssessmentEvent> extractNumberAssessmentEvents(File csvFile) throws IOException { // parse the CSV (similar to other extractXXX methods) // map rows to NumberAssessmentEvent instances }
- Use the existing extraction methods (e.g.
extractWordLearningEvents,extractLetterSoundLearningEvents) as a template for CSV parsing logic and error handling.
🤖 Prompt for AI Agents
In
src/main/java/ai/elimu/tasks/analytics/NumberAssessmentEventImportScheduler.java
at line 77, the call to
CsvAnalyticsExtractionHelper.extractNumberAssessmentEvents fails because the
method does not exist. To fix this, add a new static method named
extractNumberAssessmentEvents in
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java that accepts a
File parameter and returns a List of NumberAssessmentEvent. Implement CSV
parsing logic similar to existing methods like extractWordLearningEvents,
mapping CSV rows to NumberAssessmentEvent objects, and include appropriate error
handling such as throwing IOException.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2292 +/- ##
============================================
+ Coverage 16.37% 16.57% +0.19%
- Complexity 454 459 +5
============================================
Files 257 259 +2
Lines 7546 7718 +172
Branches 872 894 +22
============================================
+ Hits 1236 1279 +43
- Misses 6242 6370 +128
- Partials 68 69 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/test/resources/ai/elimu/util/csv/5b7c682a12ecbe2e_4000028_number-assessment-events_2025-07-03.csvis excluded by!**/*.csv,!**/*.csv
📒 Files selected for processing (3)
pom-dependency-tree.txt(1 hunks)src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java(2 hunks)src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- pom-dependency-tree.txt
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (4)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
Learnt from: venkatesh2k3
PR: elimu-ai/webapp#1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the `LetterSoundContributionEventDao.java` file, the method signatures should use `LetterSoundContributionEvent` instead of `LetterSoundCorrespondenceContributionEvent` to maintain consistency with the renaming of entities.
Learnt from: jo-elimu
PR: elimu-ai/webapp#0
File: :0-0
Timestamp: 2025-05-03T08:01:30.217Z
Learning: The elimu-ai/webapp repository has a dependency on ai.elimu:model (version 2.0.97) that provides shared model classes and enums. The Language enum should be imported from ai.elimu.model.v2.enums.Language instead of creating local duplicates.
src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java (3)
Learnt from: jo-elimu
PR: elimu-ai/webapp#2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).
Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.
Learnt from: venkatesh2k3
PR: elimu-ai/webapp#1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the `LetterSoundContributionEventDao.java` file, the method signatures should use `LetterSoundContributionEvent` instead of `LetterSoundCorrespondenceContributionEvent` to maintain consistency with the renaming of entities.
🧬 Code Graph Analysis (1)
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (1)
src/main/java/ai/elimu/util/AnalyticsHelper.java (1)
AnalyticsHelper(6-43)
⏰ Context from checks skipped due to timeout of 90000ms (12)
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: test_rest
- GitHub Check: build (windows-latest, 21)
- GitHub Check: build (windows-latest, 17)
- GitHub Check: build (ubuntu-latest, 21)
- GitHub Check: build (ubuntu-latest, 17)
- GitHub Check: build (macos-latest, 17)
- GitHub Check: build (macos-latest, 21)
🔇 Additional comments (4)
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (1)
5-5: LGTM! Import correctly added.The import for
NumberAssessmentEventis properly placed in alphabetical order and is required for the new extraction method.src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java (3)
5-5: LGTM: Import additions are appropriate.The new imports for
assertTrue,JSONObject, andNumberAssessmentEventare correctly added and necessary for the new test method functionality.Also applies to: 11-11, 19-19
220-220: TODO comment is consistent with class structure.The TODO comment for "number learning events" maintains consistency with the existing TODO structure at line 30 for "letter-sound assessment events."
192-218: Confirm existence and content of the test CSV resourceI wasn’t able to locate the file
src/test/resources/ai/elimu/util/csv/5b7c682a12ecbe2e_4000028_number-assessment-events_2025-07-03.csv
in the cloned repo. Please make sure this CSV is checked in at the expected path and that its header + first data row match the test’s expectations:
- numberSelected column = 6
- numberValue column = 9
- numberId column is empty/null
Once you confirm the file exists and those values align, the test’s assertions (including
getNumberId() == null) will be valid.
| public static List<NumberAssessmentEvent> extractNumberAssessmentEvents(File csvFile) { | ||
| log.info("extractNumberAssessmentEvents"); | ||
|
|
||
| Integer versionCode = AnalyticsHelper.extractVersionCodeFromCsvFilename(csvFile.getName()); | ||
| log.info("versionCode: " + versionCode); | ||
|
|
||
| List<NumberAssessmentEvent> numberAssessmentEvents = new ArrayList<>(); | ||
|
|
||
| // Iterate each row in the CSV file | ||
| Path csvFilePath = Paths.get(csvFile.toURI()); | ||
| log.info("csvFilePath: " + csvFilePath); | ||
| try { | ||
| Reader reader = Files.newBufferedReader(csvFilePath); | ||
| CSVFormat csvFormat = CSVFormat.DEFAULT.withFirstRecordAsHeader(); | ||
| log.info("header: " + Arrays.toString(csvFormat.getHeader())); | ||
| CSVParser csvParser = new CSVParser(reader, csvFormat); | ||
| for (CSVRecord csvRecord : csvParser) { | ||
| log.info("csvRecord: " + csvRecord); | ||
|
|
||
| // Convert from CSV to Java | ||
|
|
||
| NumberAssessmentEvent numberAssessmentEvent = new NumberAssessmentEvent(); | ||
|
|
||
| long timestampInMillis = Long.valueOf(csvRecord.get("timestamp").substring(0, 10)) * 1_000; | ||
| Calendar timestamp = Calendar.getInstance(TimeZone.getTimeZone("UTC")); | ||
| timestamp.setTimeInMillis(timestampInMillis); | ||
| numberAssessmentEvent.setTimestamp(timestamp); | ||
|
|
||
| String androidId = AnalyticsHelper.extractAndroidIdFromCsvFilename(csvFile.getName()); | ||
| numberAssessmentEvent.setAndroidId(androidId); | ||
|
|
||
| String packageName = csvRecord.get("package_name"); | ||
| numberAssessmentEvent.setPackageName(packageName); | ||
|
|
||
| Float masteryScore = Float.valueOf(csvRecord.get("mastery_score")); | ||
| numberAssessmentEvent.setMasteryScore(masteryScore); | ||
|
|
||
| Long timeSpentMs = Long.valueOf(csvRecord.get("time_spent_ms")); | ||
| numberAssessmentEvent.setTimeSpentMs(timeSpentMs); | ||
|
|
||
| String additionalData = csvRecord.get("additional_data"); | ||
| if (StringUtils.isNotBlank(additionalData)) { | ||
| numberAssessmentEvent.setAdditionalData(additionalData); | ||
| } | ||
|
|
||
| int researchExperimentOrdinal = Integer.valueOf(csvRecord.get("research_experiment")); | ||
| ResearchExperiment researchExperiment = ResearchExperiment.values()[researchExperimentOrdinal]; | ||
| numberAssessmentEvent.setResearchExperiment(researchExperiment); | ||
|
|
||
| int experimentGroupOrdinal = Integer.valueOf(csvRecord.get("experiment_group")); | ||
| ExperimentGroup experimentGroup = ExperimentGroup.values()[experimentGroupOrdinal]; | ||
| numberAssessmentEvent.setExperimentGroup(experimentGroup); | ||
|
|
||
| Integer numberValue = Integer.valueOf(csvRecord.get("number_value")); | ||
| numberAssessmentEvent.setNumberValue(numberValue); | ||
|
|
||
| // String numberSymbol = csvRecord.get("number_symbol"); | ||
| // numberAssessmentEvent.setNumberSymbol(numberSymbol); | ||
|
|
||
| if (StringUtils.isNotBlank(csvRecord.get("number_id"))) { | ||
| Long numberId = Long.valueOf(csvRecord.get("number_id")); | ||
| numberAssessmentEvent.setNumberId(numberId); | ||
| } | ||
|
|
||
| numberAssessmentEvents.add(numberAssessmentEvent); | ||
| } | ||
| csvParser.close(); | ||
| } catch (IOException ex) { | ||
| log.error(ex.getMessage()); | ||
| } | ||
|
|
||
| return numberAssessmentEvents; | ||
| } |
There was a problem hiding this comment.
Fix critical inconsistencies with established assessment event extraction patterns.
The implementation has several issues that make it inconsistent with other assessment event extractors in this class:
- Missing version-based timestamp handling (line 225) - should use conditional logic like other assessment events
- Missing version check for additionalData (lines 242-245) - should check
versionCode >= 3006000 - Missing version check and null validation for research experiments (lines 247-253) - should check
versionCode >= 3005009and useStringUtils.isNotBlank() - Unexplained commented code (lines 258-259) -
numberSymbolfield is commented without explanation
Apply this diff to fix the inconsistencies and align with the extractWordAssessmentEvents pattern:
- long timestampInMillis = Long.valueOf(csvRecord.get("timestamp").substring(0, 10)) * 1_000;
+ String timestampColumnName = null;
+ if (versionCode < 3004000) {
+ timestampColumnName = "time";
+ } else {
+ // https://github.com/elimu-ai/analytics/releases/tag/3.4.0
+ timestampColumnName = "timestamp";
+ }
+ long timestampInMillis = Long.valueOf(csvRecord.get(timestampColumnName).substring(0, 10)) * 1_000;- String additionalData = csvRecord.get("additional_data");
- if (StringUtils.isNotBlank(additionalData)) {
- numberAssessmentEvent.setAdditionalData(additionalData);
- }
+ if (versionCode >= 3006000) {
+ // https://github.com/elimu-ai/analytics/releases/tag/3.6.0
+ String additionalData = csvRecord.get("additional_data");
+ if (StringUtils.isNotBlank(additionalData)) {
+ numberAssessmentEvent.setAdditionalData(additionalData);
+ }
+ }- int researchExperimentOrdinal = Integer.valueOf(csvRecord.get("research_experiment"));
- ResearchExperiment researchExperiment = ResearchExperiment.values()[researchExperimentOrdinal];
- numberAssessmentEvent.setResearchExperiment(researchExperiment);
-
- int experimentGroupOrdinal = Integer.valueOf(csvRecord.get("experiment_group"));
- ExperimentGroup experimentGroup = ExperimentGroup.values()[experimentGroupOrdinal];
- numberAssessmentEvent.setExperimentGroup(experimentGroup);
+ if (versionCode >= 3005009) {
+ // https://github.com/elimu-ai/analytics/releases/tag/3.5.9
+
+ if (StringUtils.isNotBlank(csvRecord.get("research_experiment"))) {
+ int researchExperimentOrdinal = Integer.valueOf(csvRecord.get("research_experiment"));
+ ResearchExperiment researchExperiment = ResearchExperiment.values()[researchExperimentOrdinal];
+ numberAssessmentEvent.setResearchExperiment(researchExperiment);
+ }
+
+ if (StringUtils.isNotBlank(csvRecord.get("experiment_group"))) {
+ int experimentGroupOrdinal = Integer.valueOf(csvRecord.get("experiment_group"));
+ ExperimentGroup experimentGroup = ExperimentGroup.values()[experimentGroupOrdinal];
+ numberAssessmentEvent.setExperimentGroup(experimentGroup);
+ }
+ }- // String numberSymbol = csvRecord.get("number_symbol");
- // numberAssessmentEvent.setNumberSymbol(numberSymbol);
+ String numberSymbol = csvRecord.get("number_symbol");
+ numberAssessmentEvent.setNumberSymbol(numberSymbol);🤖 Prompt for AI Agents
In src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java between
lines 202 and 274, fix inconsistencies by adding version-based conditional logic
for timestamp extraction similar to other assessment event extractors, ensure
additionalData is only set if versionCode is at least 3006000, add a versionCode
check (>= 3005009) and use StringUtils.isNotBlank() before setting
researchExperiment, and remove or properly explain the commented-out
numberSymbol code to maintain clarity and consistency.
Issue Number
Purpose
Technical Details
Testing Instructions
Screenshots
Format Checks
Note
Files in PRs are automatically checked for format violations with
mvn spotless:check.If this PR contains files with format violations, run
mvn spotless:applyto fix them.